Chinese Character Expansion for Retrieving Japanese Paraphrases

Authors

  • Takenobu Tokunaga
  • Yoshiki Tezuka
  • Hozumi Tanaka
Abstract

This paper proposes two query expansion methods for retrieving paraphrase candidates indexed by Kanzi (Chinese) characters. The idea is to calculate the similarity between Kanzi characters from an ordinary thesaurus that defines relations between words. The local analysis method calculates the similarity of Kanzi characters from the semantic class to which the words in a query belong; in contrast, the global analysis method calculates similarity from all the semantic classes in the thesaurus. The methods were evaluated using the EDR concept dictionary, which defines about 410,000 concepts. In the experiments, both the headwords and the concept descriptions of dictionary entries were indexed by Kanzi characters, and concept descriptions were retrieved by giving a headword as a query. The experiments showed that Kanzi character expansion is significantly effective and that the global analysis method achieves better recall than the local analysis method.
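
The contrast between the local and the global method can be made concrete with a small sketch. The abstract does not give the actual similarity formulas, so everything below is an assumption for illustration only: the toy thesaurus, the function names, the use of co-occurrence counts for the local variant, and the use of cosine similarity over class-membership vectors for the global variant.

```python
from collections import Counter
from math import sqrt

# Toy thesaurus (hypothetical data): semantic class -> words in that class.
THESAURUS = {
    "study": ["学習", "勉強", "学問"],
    "school": ["学校", "校舎"],
    "effort": ["勉強", "努力"],
}


def char_class_vectors(thesaurus):
    """Map each character to a Counter of the classes whose words contain it."""
    vectors = {}
    for cls, words in thesaurus.items():
        for word in words:
            for ch in set(word):
                vectors.setdefault(ch, Counter())[cls] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def local_expand(query, thesaurus, top_n=3):
    """Local variant (assumed form): look only at classes containing the query word."""
    counts = Counter()
    for words in thesaurus.values():
        if query in words:
            for word in words:
                counts.update(set(word) - set(query))
    return sorted(counts, key=lambda c: (-counts[c], c))[:top_n]


def global_expand(query, thesaurus, top_n=3):
    """Global variant (assumed form): compare characters over all classes."""
    vectors = char_class_vectors(thesaurus)
    query_chars = [q for q in set(query) if q in vectors]
    scores = {}
    for ch, vec in vectors.items():
        if ch in query:
            continue
        best = max((cosine(vec, vectors[q]) for q in query_chars), default=0.0)
        if best > 0:
            scores[ch] = best
    return sorted(scores, key=lambda c: (-scores[c], c))[:top_n]


if __name__ == "__main__":
    print("local :", local_expand("学校", THESAURUS, top_n=10))
    print("global:", global_expand("学校", THESAURUS, top_n=10))
```

On this toy data the global variant proposes more expansion characters than the local one, because it can reach characters outside the query word's own class. That is only an illustration of the mechanism, not evidence for the recall result reported in the abstract.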

Similar resources

Paraphrasing Japanese Noun Phrases using Character-based Indexing

This paper proposes a novel method to extract paraphrases of Japanese noun phrases from a set of documents. The proposed method consists of three steps: (1) retrieving passages using character-based index terms given a noun phrase as an input query, (2) filtering the retrieved passages with syntactic and semantic constraints, and (3) ranking the passages and reformatting them into grammatical f...

Chinese Paraphrases Acquiring Based on Random Walk N Steps

Jun Ma, Yujie Zhang, Jinan Xu, Yufeng Chen (Beijing Jiaotong University, Beijing, 100044, China). Abstract: The conventional “pivot” approach to acquiring paraphrases from a bilingual corpus is limited in that only candidate paraphrases within two steps are considered. In this paper, we propose a graph-based model for acquiring paraphrases from a phrase translation table. First, we describe a grap...

Minimally Supervised Method for Multilingual Paraphrase Extraction from Definition Sentences on the Web

We propose a minimally supervised method for multilingual paraphrase extraction from definition sentences on the Web. Hashimoto et al. (2011) extracted paraphrases from Japanese definition sentences on the Web, assuming that definition sentences defining the same concept tend to contain paraphrases. However, their method requires manually annotated data and is language dependent. We extend thei...

Extracting Paraphrases of Japanese Action Word of Sentence Ending Part from Web and Mobile News Articles

In this research, we extract paraphrases from Japanese Web news articles, which are long and designed for personal computer screens, and from mobile news articles, which are short, compact, and designed for the small screens of mobile terminals. We collected them for more than two years and aligned them at the article level and then at the sentence level. As a result, we obtained more than 88,000 pairs of a...

Extracting Paraphrases of Japanese Sentence Ending Part From Web and Mobile News Articles

In this research, we extract paraphrases from Japanese Web news articles, which are long and designed for personal computer screens, and from mobile news articles, which are short, compact, and designed for the small screens of mobile terminals. We collected them for more than two years and aligned them at the article level and then at the sentence level. As a result, we obtained more than 88,000 pairs of a...

Publication date: 2004